Constraint-Based Discovery and Inductive Queries: Application to Association Rule Mining

Authors

  • Baptiste Jeudy
  • Jean-François Boulicaut
Abstract

Recently, inductive databases (IDBs) have been proposed to address the problem of knowledge discovery from huge databases. Querying these databases requires primitives to: (1) select, manipulate and query data, (2) select, manipulate and query “interesting” patterns (i.e., those patterns that satisfy certain constraints), and (3) cross over patterns and data (e.g., selecting the data in which some patterns hold). Designing such query languages is a long-term goal, and only preliminary approaches have been studied, mainly for the association rule mining task. Starting from a discussion of the MINE RULE operator, we identify several open issues for the design of inductive databases dedicated to these descriptive rules. These issues concern not only the offered primitives but also the availability of efficient evaluation schemes. We emphasize the need for primitives that work on more or less condensed representations of the frequent itemsets, e.g., the (frequent) δ-free and closed itemsets. Such primitives are useful not only for optimizing single association rule mining queries but also for sophisticated post-processing and interactive rule mining.
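
The condensed representations named in the abstract can be illustrated with a small sketch. It is not taken from the paper: the toy transactions, the function names, and the brute-force enumeration are all assumptions made for illustration, and the definitions used are the commonly stated ones (an itemset is closed when no proper superset has the same support; it is δ-free when no rule between its items holds with at most δ exceptions).

```python
from itertools import combinations

# Hypothetical toy transaction database (not from the paper).
TRANSACTIONS = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("bd"),
]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of non-empty itemsets; toy-sized data only."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, transactions)
            if count >= min_support:
                result[candidate] = count
    return result


def is_closed(itemset, transactions):
    """Closed: the items shared by all supporting transactions are exactly `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    return bool(covering) and frozenset.intersection(*covering) == itemset


def is_delta_free(itemset, transactions, delta):
    """delta-free: no rule (itemset - {a}) => a holds with at most `delta` exceptions."""
    full = support(itemset, transactions)
    return all(
        support(itemset - {a}, transactions) - full > delta for a in itemset
    )


frequent = frequent_itemsets(TRANSACTIONS, min_support=2)
closed = {s: n for s, n in frequent.items() if is_closed(s, TRANSACTIONS)}
free = {s: n for s, n in frequent.items() if is_delta_free(s, TRANSACTIONS, delta=0)}
```

On this toy data, seven itemsets are frequent, but only three of them are closed and two are 0-free, which hints at why a query primitive that works on such representations can be cheaper to evaluate than one that materializes every frequent itemset.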

Similar resources

A Survey on Condensed Representations for Frequent Sets

Solving inductive queries that have to return complete collections of patterns satisfying a given predicate has been studied extensively in the last few years. The specific problem of frequent set mining from potentially huge boolean matrices has given rise to tens of efficient solvers. Frequent sets are indeed useful for many data mining tasks, including the popular association rule mining task ...

Using Constraints During Set Mining: Should We Prune or not?

Knowledge discovery in databases (KDD) is an interactive process that can be considered from a querying perspective. Within the inductive database framework, an inductive query on a database is a query that might return generalizations about the data, e.g., frequent itemsets, association rules, data dependencies. To study evaluation schemes of such queries, we focus on the simple case of (freque...

Condensed representations for data mining

INTRODUCTION Condensed representations have been proposed in (Mannila & Toivonen, 1996) as a useful concept for the optimization of typical data mining tasks. It appears as a key concept Raedt, 2002) and this paper introduces this research domain, its achievements in the context of frequent itemset mining (FIM) from transactional data and its future trends. Within the inductive database framewo...

SPADA: A Spatial Association Discovery System

This paper presents a spatial association discovery system, named SPADA, which has been developed according to the theoretical framework of inductive databases. Our approach considers inductive databases as deductive databases with an integrated inductive component and relies on techniques borrowed from the field of Inductive Logic Programming (ILP). In SPADA, an ILP module supports the process...

Extending the Soft Constraint Based Mining Paradigm

The paradigm of pattern discovery based on constraints has been recognized as a core technique in inductive querying: constraints provide the user with a tool to drive the discovery process towards potentially interesting patterns, with the positive side effect of achieving a more efficient computation. So far the research on this paradigm has mainly focussed on the latter aspect: the development...

Publication date: 2002